Imperfect wheat kernel recognition combined with image enhancement and convolutional neural network
HE Jiean, WU Xiaohong, HE Xiaohai, HU Jianrong, QIN Linbo
Journal of Computer Applications    2021, 41 (3): 911-916.   DOI: 10.11772/j.issn.1001-9081.2020060864
In practical application scenarios, wheat kernel images have a simple background, the defects of imperfect wheat grains are mostly local features, and most image features do not differ from those of normal grains. To address these problems, an imperfect wheat kernel recognition method based on detail Image Enhancement (IE) was proposed. Firstly, an alternating minimization algorithm was used to constrain the L0 norms of the original image in the horizontal and vertical directions, smoothing the original image into a base layer; the base layer was subtracted from the original image to obtain the detail layer. Then, the detail layer was amplified and superimposed on the base layer to produce the enhanced image. Finally, the enhanced image was used as the input of a Convolutional Neural Network (CNN), and a CNN with Batch Normalization (BN) layers was used for recognition. The classic classification networks LeNet-5, ResNet-34 and VGG-16, with and without BN layers, were used as classifiers, and the images before and after enhancement were used as input in classification experiments, with test-set accuracy as the performance measure. Experimental results show that, with the same input, adding BN layers increases the test-set accuracy of all three classic networks by 5 percentage points; using detail-enhanced images as input increases the test-set accuracy of the three networks by 1 percentage point; and combining the two improves the test-set accuracy of all three networks by more than 7 percentage points.
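The base/detail decomposition described above can be sketched as follows. This is a simplified illustration: a box filter stands in for the paper's L0-gradient alternating-minimization smoothing, and the function names and `boost` factor are hypothetical.

```python
import numpy as np

def box_blur(img, k=5):
    # crude base-layer estimate; the paper uses L0-gradient smoothing instead
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def enhance_detail(img, boost=2.0):
    img = np.asarray(img, dtype=float)
    base = box_blur(img)                           # smoothed base layer
    detail = img - base                            # detail layer
    return np.clip(base + boost * detail, 0, 255)  # amplified detail + base
```

With `boost=1.0` the decomposition is exactly invertible, which is a useful sanity check before turning the amplification up.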
Video abnormal behavior detection based on dual prediction model of appearance and motion features
LI Ziqiang, WANG Zhengyong, CHEN Honggang, LI Linyi, HE Xiaohai
Journal of Computer Applications    2021, 41 (10): 2997-3003.   DOI: 10.11772/j.issn.1001-9081.2020121906
In order to make full use of appearance and motion information in video abnormal behavior detection, a Siamese network model that captures appearance and motion information at the same time was proposed. The two branches of the network were composed of the same autoencoder structure. Several consecutive frames of RGB images were used as the input of the appearance sub-network to predict the next frame, while RGB frame-difference images were used as the input of the motion sub-network to predict the future frame difference. In addition, two factors that weaken prediction-based detection were considered: the diversity of normal samples, and the powerful "generation" ability of the autoencoder network, which yields good predictions even for some abnormal samples. Therefore, a memory enhancement module that learns and stores the "prototype" features of normal samples was added between the encoder and the decoder, so that abnormal samples produced larger prediction errors. Extensive experiments were conducted on three public anomaly detection datasets: Avenue, UCSD-ped2 and ShanghaiTech. Experimental results show that, compared with other video abnormal behavior detection methods based on reconstruction or prediction, the proposed method achieves better performance. Specifically, the average Area Under Curve (AUC) of the proposed method on the Avenue, UCSD-ped2 and ShanghaiTech datasets reaches 88.2%, 97.5% and 73.0% respectively.
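The prediction-error scoring that such prediction-based detectors rely on can be sketched as below; the function names and the PSNR-based score are illustrative conventions, not the paper's exact regularity score.

```python
import numpy as np

def frame_differences(frames):
    # RGB frame differences: the input fed to the motion sub-network
    return frames[1:] - frames[:-1]

def prediction_psnr(pred, actual, peak=255.0, eps=1e-12):
    # lower PSNR = larger prediction error = more likely abnormal frame
    mse = np.mean((pred - actual) ** 2)
    return 10.0 * np.log10(peak ** 2 / (mse + eps))
```

A frame whose prediction error is large (low PSNR) is flagged as anomalous after per-video score normalization.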
Text-to-image synthesis method based on multi-level progressive resolution generative adversarial networks
XU Yining, HE Xiaohai, ZHANG Jin, QING Linbo
Journal of Computer Applications    2020, 40 (12): 3612-3617.   DOI: 10.11772/j.issn.1001-9081.2020040575
To address the problem that text-to-image synthesis results suffer from wrong target structures and unclear image textures, a Multi-level Progressive Resolution Generative Adversarial Network (MPRGAN) model was proposed based on the Attentional Generative Adversarial Network (AttnGAN). Firstly, a semantic separation-fusion generation module was used in the low-resolution layer: the text feature was separated into three feature vectors under the guidance of a self-attention mechanism, and these feature vectors were used to generate feature maps respectively. Then, the feature maps were fused into a low-resolution map, and mask images were used as semantic constraints to improve the stability of the low-resolution generator. Finally, a progressive resolution residual structure was adopted in the high-resolution layers, combined with a word attention mechanism and pixel shuffle to further improve the quality of the generated images. Experimental results show that the Inception Score (IS) of the proposed model reaches 4.70 and 3.53 on the Caltech-UCSD Birds-200-2011 (CUB-200-2011) and Oxford-102 flower datasets respectively, which is 7.80% and 3.82% higher than that of AttnGAN. The MPRGAN model can alleviate the instability of structure generation to a certain extent, and the images it generates are closer to real images.
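The pixel-shuffle (sub-pixel) upsampling mentioned above rearranges channel depth into spatial resolution. A NumPy sketch, assuming a channels-first layout:

```python
import numpy as np

def pixel_shuffle(x, r):
    # (C*r*r, H, W) -> (C, H*r, W*r): trade channel depth for resolution
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```

Each r*r group of input channels fills one r*r spatial block of the output, which avoids the checkerboard artifacts of transposed convolution.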
People counting method combined with feature map learning
YI Guoxian, XIONG Shuhua, HE Xiaohai, WU Xiaohong, ZHENG Xinbo
Journal of Computer Applications    2018, 38 (12): 3591-3595.   DOI: 10.11772/j.issn.1001-9081.2018051162
In order to solve problems such as background interference, illumination variation and occlusion between targets in people counting for real public-scene videos, a people counting method combining feature-map learning with first-order dynamic linear regression was proposed. Firstly, a mapping model between the Scale-Invariant Feature Transform (SIFT) features of the image and the true target density map was established, and a feature map containing target and background features was obtained using this mapping model and the SIFT features. Then, given that the background changes little in surveillance video and the background features in the feature map are relatively stable, a people counting regression model was established by first-order dynamic linear regression between the integral of the feature map and the actual number of people. Finally, the estimated number of people was obtained through the regression model. Experiments were performed on the MALL and PETS2009 datasets. The experimental results show that, compared with the cumulative attribute space method, the mean absolute error of the proposed method is reduced by 2.2%; compared with the first-order dynamic linear regression method based on corner detection, the mean absolute error and the mean relative error of the proposed method are reduced by 6.5% and 2.3% respectively.
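The counting regression reduces to a first-order linear relation between the feature-map integral and the true count. A minimal sketch, with a static least-squares fit standing in for the paper's dynamic (online) update, and hypothetical function names:

```python
import numpy as np

def fit_count_regressor(feature_maps, counts):
    # fit count ~ a * sum(feature_map) + b over training frames
    integrals = np.array([fm.sum() for fm in feature_maps])
    a, b = np.polyfit(integrals, np.asarray(counts, dtype=float), 1)
    return a, b

def estimate_count(feature_map, a, b):
    # integrate the feature map and apply the linear model
    return a * feature_map.sum() + b
```

In the dynamic variant, `a` and `b` would be re-estimated frame by frame as new annotated samples arrive.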
Adaptive bi-lp-l2-norm based blind super-resolution reconstruction for single blurred image
LI Tao, HE Xiaohai, TENG Qizhi, WU Xiaoqiang
Journal of Computer Applications    2017, 37 (8): 2313-2318.   DOI: 10.11772/j.issn.1001-9081.2017.08.2313
An adaptive bi-lp-l2-norm based blind super-resolution reconstruction method was proposed to improve the quality of a low-resolution blurred image; it consists of an independent blur-kernel estimation sub-process and a non-blind super-resolution reconstruction sub-process. In the blur-kernel estimation sub-process, bi-lp-l2-norm regularization was imposed on both the sharp image and the blur kernel. Moreover, by introducing threshold segmentation of image gradients, the lp-norm and l2-norm constraints on the sharp image were combined adaptively. With the estimated blur kernel, a non-blind super-resolution reconstruction method based on non-locally centralized sparse representation was used to reconstruct the final high-resolution image. In simulation experiments, compared with the bi-l0-l2-norm based method, the proposed method's average Peak Signal-to-Noise Ratio (PSNR) gain was 0.16 dB higher, its average Structural Similarity Index Measure (SSIM) gain was 0.0045 higher, and its average Sum of Squared Difference (SSD) reduction ratio was 0.13 lower. The experimental results demonstrate the superior performance of the proposed method in terms of kernel estimation accuracy and reconstructed image quality.
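The adaptive combination of lp and l2 constraints hinges on threshold segmentation of image gradients: strong-gradient (edge) pixels receive the sparsity-promoting lp constraint, smooth regions the l2 constraint. A sketch of such a mask, with an illustrative threshold and forward differences:

```python
import numpy as np

def gradient_mask(img, thresh):
    # forward differences (edge-replicated at the border)
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    # True: apply the sparse lp constraint; False: apply the l2 constraint
    return np.hypot(gx, gy) > thresh
```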
Adaptive video super-resolution reconstruction algorithm based on multi-order derivative
JI Xiaohong, XIONG Shuhua, HE Xiaohai, CHEN Honggang
Journal of Computer Applications    2016, 36 (4): 1092-1095.   DOI: 10.11772/j.issn.1001-9081.2016.04.1092
The traditional video super-resolution reconstruction algorithm cannot preserve image edge details effectively while removing noise. In order to solve this problem, a video super-resolution reconstruction algorithm combining an adaptive regularization term with a multi-order derivative data term was put forward. Based on the regularization reconstruction model, the multi-order derivative of the noise, which describes the statistical characteristics of the noise well, was introduced into the improved data term; meanwhile, Total Variation (TV) and Non-Local Means (NLM), which have good denoising performance, were used as regularization terms to constrain the reconstruction process. In addition, to preserve details better, the regularization coefficient was weighted adaptively according to structural information extracted by the regional spatially adaptive curvature difference algorithm. In comparison experiments with the kernel-regression algorithm and the clustering algorithm at a noise variance of 3, the video reconstructed by the proposed algorithm has sharper edges and a more accurate and clear structure; the average Mean Squared Error (MSE) is decreased by 25.75% and 22.50% respectively, and the Peak Signal-to-Noise Ratio (PSNR) is increased by 1.35 dB and 1.14 dB respectively. The results indicate that the proposed algorithm can effectively preserve image details while removing noise.
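One of the regularization terms named above, Total Variation, is simply the summed magnitude of first-order image differences; a minimal sketch of the anisotropic form:

```python
import numpy as np

def total_variation(img):
    # anisotropic TV: sum of absolute horizontal and vertical differences
    dx = np.abs(np.diff(img, axis=1)).sum()
    dy = np.abs(np.diff(img, axis=0)).sum()
    return dx + dy
```

Minimizing TV penalizes oscillation (noise) while tolerating a few sharp jumps (edges), which is why it pairs well with an edge-preserving data term.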
Wavelet domain distributed depth map video coding based on non-uniform quantization
CHEN Zhenzhen, QING Linbo, HE Xiaohai, WANG Yun
Journal of Computer Applications    2016, 36 (4): 1080-1084.   DOI: 10.11772/j.issn.1001-9081.2016.04.1080
In order to improve the decoding quality of depth map video in Distributed Multi-view Video plus Depth (DMVD) coding, a new non-uniform quantization scheme based on the sub-band layer and sub-band coefficients was proposed for wavelet-domain Distributed Video Coding (DVC). The main idea was to allocate more bits to pixels belonging to depth-map edges, thereby improving the quality of the depth map. According to the distribution characteristics of depth-map wavelet coefficients, the low-frequency wavelet coefficients of layer N kept the uniform quantization scheme, while the high-frequency wavelet coefficients of all layers used the non-uniform quantization scheme. For high-frequency wavelet coefficients around zero, a larger quantization step was adopted; as the amplitude of the high-frequency wavelet coefficients increased, the quantization step decreased, so that quantization became finer and edge quality improved. The experimental results show that, for the "Dancer" and "PoznanHall2" depth sequences with more edges, the proposed scheme can achieve up to 1.2 dB of Rate-Distortion (R-D) performance improvement by improving edge quality; for the "Newspaper" and "Balloons" depth sequences with fewer edges, the proposed scheme can still gain 0.3 dB in R-D performance.
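The amplitude-dependent step sizes described above can be sketched as a banded quantizer; the step values and band edges here are illustrative, not the paper's actual tables:

```python
def nonuniform_quantize(coeff, steps=(8.0, 4.0, 2.0), edges=(4.0, 16.0)):
    # coarse step near zero, finer steps as coefficient amplitude grows,
    # so large (edge-related) high-frequency coefficients keep more precision
    a = abs(coeff)
    if a < edges[0]:
        step = steps[0]
    elif a < edges[1]:
        step = steps[1]
    else:
        step = steps[2]
    return round(coeff / step) * step
```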
Rock classification of multi-feature fusion based on collaborative representation
LIU Juexian, TENG Qizhi, WANG Zhengyong, HE Xiaohai
Journal of Computer Applications    2016, 36 (3): 854-858.   DOI: 10.11772/j.issn.1001-9081.2016.03.854
To solve the issues of time consumption and low recognition rate in traditional component analysis of rock slices, a component analysis method for rock slices based on Collaborative Representation (CR) was proposed. Firstly, the texture features of grains in rock slices were discussed, and combining Hierarchical Multi-scale Local Binary Pattern (HMLBP) with Gray Level Co-occurrence Matrix (GLCM) was shown to characterize grain texture well. Then, in order to reduce the time complexity of classification, the dimension of the new features was reduced to 100 using Principal Component Analysis (PCA). Finally, Collaborative Representation based Classification (CRC) was used as the classifier. Unlike Sparse Representation based Classification (SRC), prediction samples were encoded collaboratively by all samples in the training dictionary rather than by single samples alone, and shared attributes across different samples improve the recognition rate. The experimental results show that, compared to SRC, the recognition speed of the method increases by 300% and its recognition rate increases by 2%. In practical application, it can distinguish quartz and feldspar components in rock slices well.
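CRC amounts to ridge-regression coding of the query over the whole training dictionary, followed by class-wise residual comparison. A minimal sketch (dictionary columns are training samples; the regularization weight is illustrative):

```python
import numpy as np

def crc_classify(D, labels, y, lam=0.01):
    # collaborative coding: x = (D^T D + lam*I)^-1 D^T y over ALL atoms
    n = D.shape[1]
    x = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    best_label, best_res = None, np.inf
    for c in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == c]
        res = np.linalg.norm(y - D[:, idx] @ x[idx])  # class-wise residual
        if res < best_res:
            best_label, best_res = c, res
    return best_label
```

Because the code has a closed form, classification is much faster than the iterative l1 solvers required by SRC, which matches the speedup reported above.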
Adaptive shadow removal based on superpixel and local color constancy
LAN Li, HE Xiaohai, WU Xiaohong, TENG Qizhi
Journal of Computer Applications    2016, 36 (10): 2837-2841.   DOI: 10.11772/j.issn.1001-9081.2016.10.2837
In order to remove moving cast shadows in surveillance video quickly and efficiently, an adaptive shadow elimination method based on superpixels and the local color constancy of shaded areas was proposed. First, the improved simple linear iterative clustering algorithm was used to divide the moving area of the video image into non-overlapping superpixels. Then, the luminance ratio of the background to the moving foreground was calculated in the RGB color space, and the local color constancy of the shaded area was analyzed. Finally, the standard deviation of the luminance ratio was computed with the superpixel as the basic processing unit, and an adaptive threshold algorithm based on turning points, derived from the characteristics and distribution of this standard deviation in shadowed regions, was proposed to detect and remove shadows. Experimental results show that the proposed method can process shadows in different scenarios: the shadow detection rate and discrimination rate are both above 85%, the computational cost is greatly reduced by using superpixels, and the average processing time per frame is 20 ms. The proposed algorithm satisfies the shadow removal requirements of high precision, real-time performance and robustness.
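The local-color-constancy test reduces to checking, per superpixel, that the foreground/background luminance ratio is both attenuating (shadows darken) and nearly constant (low standard deviation). A sketch with illustrative thresholds and a hypothetical function name:

```python
import numpy as np

def is_shadow_superpixel(fg, bg, lo=0.3, hi=0.95, std_max=0.05):
    # shadows darken the background by a roughly constant factor,
    # so the per-pixel ratio should be < 1 with a small spread
    ratio = np.asarray(fg, dtype=float) / (np.asarray(bg, dtype=float) + 1e-6)
    return lo < ratio.mean() < hi and ratio.std() < std_max
```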
Improved enhancement algorithm of fog image based on multi-scale Retinex with color restoration
LI Yaofeng, HE Xiaohai, WU Xiaoqiang
Journal of Computer Applications    2014, 34 (10): 2996-2999.   DOI: 10.11772/j.issn.1001-9081.2014.10.2996

An improved Multi-Scale Retinex with Color Restoration (MSRCR) algorithm was proposed to remove fog in distant regions and to address the gray-world assumption problem. First, the original fog image was inverted and the MSRCR algorithm was applied to it; the result was inverted again and linearly superimposed with the result of applying MSRCR directly to the original image. At the same time, the reflection component obtained during extraction was linearly superimposed with the original luminance, and the mean and variance were calculated to decide the degree of contrast stretching adaptively; finally, the result was uniformly stretched to the display range. Experimental results show that the proposed algorithm removes fog more effectively: evaluation values of the processed image, including standard deviation, average brightness, information entropy and squared gradient, are improved over the original algorithm. The method is easy to implement and is of practical significance for real-time video defogging.
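The invert-enhance-invert-and-blend pipeline can be sketched as below; `enhance` stands in for the MSRCR step, and the blend weight and function names are illustrative:

```python
import numpy as np

def invert(img):
    # photographic negative of an 8-bit-range image
    return 255.0 - img

def defog_via_inversion(img, enhance, alpha=0.5):
    direct = enhance(img)                   # MSRCR applied to the original
    flipped = invert(enhance(invert(img)))  # MSRCR in the inverted domain
    # linear superposition of the two enhanced results
    return alpha * direct + (1.0 - alpha) * flipped
```

Enhancing the inverted image targets dense (bright) fog, which behaves like a dark, low-contrast region after inversion; blending recovers both near and far scene content.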
